Rank in Wordlist | Frequency | Word |
---|---|---|
3735 | 4 | 1,2 |
3743 | 4 | 3,5 |
4894 | 3 | 1,5 |
7064 | 2 | 0,5 |
7067 | 2 | 1,8 |
7103 | 2 | 3,3 |
7104 | 2 | 3,7 |
7112 | 2 | 5,5 |
7120 | 2 | 8,5 |
12235 | 1 | 0,6 |
Rank in Wordlist | Frequency | Word |
---|---|---|
12316 | 1 | 15% |
12549 | 1 | 45% |
12583 | 1 | 6% |
12608 | 1 | 7,76% |
12623 | 1 | 75% |
12671 | 1 | 99% |
Rank in Wordlist | Frequency | Word |
---|---|---|
244 | 43 | ." |
Rank in Wordlist | Frequency | Word |
---|---|---|
17239 | 1 | O'Malleyovi |
32149 | 1 | r'n'b |
Rank in Wordlist | Frequency | Word |
---|---|---|
12244 | 1 | 1+1 |
12245 | 1 | 1+2 |
12478 | 1 | 3+1 |
Rank in Wordlist | Frequency | Word |
---|---|---|
2483 | 6 | FOTOGALERIE/ |
7422 | 2 | INFOGRAFIKA/ |
7852 | 2 | ROZHOVOR/ |
12418 | 1 | 2/10 |
12421 | 1 | 2014/2015 |
12531 | 1 | 4/4 |
12691 | 1 | ANKETA/ |
13473 | 1 | CZ/SK |
14041 | 1 | Dědice/REPORTÁŽ |
14233 | 1 | FOTO/ |
In the last subsection of this type, we look for words containing other special characters: , ( ) % & $ " ' + * = / _
Depending on the language, some of these characters may be allowed within words, while others are not. If words containing forbidden characters do not have a very low frequency, there may be a problem in preprocessing.
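The frequency check described above can be sketched in a few lines of Python. This is only an illustration: the function name `flag_suspicious`, the `allowed` set, and the threshold `min_freq=5` are assumptions chosen for the example, not values prescribed by this report. The rows are taken from the tables in this section.

```python
# Special characters examined in this subsection.
SPECIAL_CHARS = set(",()%&$\"'+*=/_")

def flag_suspicious(rows, allowed=frozenset(), min_freq=5):
    """Return rows whose word contains a forbidden character and whose
    frequency is at least min_freq (the threshold is an assumption)."""
    forbidden_chars = SPECIAL_CHARS - set(allowed)
    flagged = []
    for rank, freq, word in rows:
        found = forbidden_chars & set(word)
        if found and freq >= min_freq:
            flagged.append((rank, freq, word, sorted(found)))
    return flagged

# (rank, frequency, word) rows copied from the tables in this section.
demo_rows = [
    (244, 43, '."'),
    (2483, 6, "FOTOGALERIE/"),
    (3735, 4, "1,2"),
    (17239, 1, "O'Malleyovi"),
]

if __name__ == "__main__":
    # Hypothetical language setting: comma and apostrophe are allowed in words.
    for hit in flag_suspicious(demo_rows, allowed={",", "'"}):
        print(hit)
```

With these assumed settings, only the entries `."` and `FOTOGALERIE/` are reported, since they combine a forbidden character with a frequency at or above the threshold.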
Words containing %:
select w_id-100,freq, word from words where w_id>100 and word like "%\%%" limit 10;
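The same query can be run for any of the special characters in turn. The sketch below is a minimal illustration using Python's built-in `sqlite3` module rather than the database used for this report. The in-memory table, the helper names `build_demo_db` and `words_containing`, and the demo `w_id` values (rank plus 100, following the `w_id-100` expression in the query) are assumptions. It uses SQLite's `instr()` for a literal substring match, so `%` and `_` need no `LIKE` escaping.

```python
import sqlite3

def build_demo_db():
    """Create an in-memory words table with a few rows (demo data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE words (w_id INTEGER PRIMARY KEY, freq INTEGER, word TEXT)"
    )
    rows = [
        (3835, 4, "1,2"),
        (12416, 1, "15%"),
        (12708, 1, "7,76%"),
        (12344, 1, "1+1"),
        (2583, 6, "FOTOGALERIE/"),
        (50, 999, "%"),  # hypothetical low id, excluded by w_id > 100
    ]
    conn.executemany("INSERT INTO words VALUES (?, ?, ?)", rows)
    return conn

def words_containing(conn, char, limit=10):
    """Return (rank, freq, word) rows for words containing char."""
    query = (
        "SELECT w_id - 100 AS rank_in_wordlist, freq, word FROM words "
        "WHERE w_id > 100 AND instr(word, ?) > 0 "
        "ORDER BY w_id LIMIT ?"
    )
    return conn.execute(query, (char, limit)).fetchall()

if __name__ == "__main__":
    conn = build_demo_db()
    for ch in ",()%&$\"'+*=/_":
        hits = words_containing(conn, ch)
        if hits:
            print(repr(ch), hits)
```

Passing the character as a bound parameter keeps the query text identical for every character, including the quote characters.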
3.12.1 Words with Hyphens
3.12.2 Multiwords
3.12.3 (Multi-)Words with dots